428 research outputs found

    Discrete Visual Perception

    Get PDF
    International audienceComputational vision and biomedical image have made tremendous progress of the past decade. This is mostly due the development of efficient learning and inference algorithms which allow better, faster and richer modeling of visual perception tasks. Graph-based representations are among the most prominent tools to address such perception through the casting of perception as a graph optimization problem. In this paper, we briefly introduce the interest of such representations, discuss their strength and limitations and present their application to address a variety of problems in computer vision and biomedical image analysis

    Modeling the structure of multivariate manifolds: Shape maps

    Get PDF
    We propose a shape population metric that reflects the interdependencies between points observed in a set of examples. It provides a notion of topology for shape and appearance models that represents the behavior of individual observations in a metric space, in which distances between points correspond to their joint modeling properties. A Markov chain is learnt using the description lengths of models that describe sub sets of the entire data. The according diffusion map or shape map provides for the metric that reflects the behavior of the training population. With this metric functional clustering, deformation- or motion segmentation, sparse sampling and the treatment of outliers can be dealt with in a unified and transparent manner. We report experimental results on synthetic and real world data and compare the framework with existing specialized approaches. 1

    Building detection in very high resolution multispectral data with deep learning features

    Get PDF
    International audienceThe automated man-made object detection and building extraction from single satellite images is, still, one of the most challenging tasks for various urban planning and monitoring engineering applications. To this end, in this paper we propose an automated building detection framework from very high resolution remote sensing data based on deep convolu-tional neural networks. The core of the developed method is based on a supervised classification procedure employing a very large training dataset. An MRF model is then responsible for obtaining the optimal labels regarding the detection of scene buildings. The experimental results and the performed quantitative validation indicate the quite promising potentials of the developed approach

    Learning Grammars for Architecture-Specific Facade Parsing

    Get PDF
    International audienceParsing facade images requires optimal handcrafted grammar for a given class of buildings. Such a handcrafted grammar is often designed manually by experts. In this paper, we present a novel framework to learn a compact grammar from a set of ground-truth images. To this end, parse trees of ground-truth annotated images are obtained running existing inference algorithms with a simple, very general grammar. From these parse trees, repeated subtrees are sought and merged together to share derivations and produce a grammar with fewer rules. Furthermore, unsupervised clustering is performed on these rules, so that, rules corresponding to the same complex pattern are grouped together leading to a rich compact grammar. Experimental validation and comparison with the state-of-the-art grammar-based methods on four diff erent datasets show that the learned grammar helps in much faster convergence while producing equal or more accurate parsing results compared to handcrafted grammars as well as grammars learned by other methods. Besides, we release a new dataset of facade images from Paris following the Art-deco style and demonstrate the general applicability and extreme potential of the proposed framework

    Cooperative Object Segmentation and Behavior Inference inImage Sequences

    Get PDF
    In this paper, we propose a general framework for fusing bottom-up segmentation with top-down object behavior inference over an image sequence. This approach is beneficial for both tasks, since it enables them to cooperate so that knowledge relevant to each can aid in the resolution of the other, thus enhancing the final result. In particular, the behavior inference process offers dynamic probabilistic priors to guide segmentation. At the same time, segmentation supplies its results to the inference process, ensuring that they are consistent both with prior knowledge and with new image information. The prior models are learned from training data and they adapt dynamically, based on newly analyzed images. We demonstrate the effectiveness of our framework via particular implementations that we have employed in the resolution of two hand gesture recognition applications. Our experimental results illustrate the robustness of our joint approach to segmentation and behavior inference in challenging conditions involving complex backgrounds and occlusions of the target objec

    EnzyNet: enzyme classification using 3D convolutional neural networks on spatial representation

    Full text link
    During the past decade, with the significant progress of computational power as well as ever-rising data availability, deep learning techniques became increasingly popular due to their excellent performance on computer vision problems. The size of the Protein Data Bank has increased more than 15 fold since 1999, which enabled the expansion of models that aim at predicting enzymatic function via their amino acid composition. Amino acid sequence however is less conserved in nature than protein structure and therefore considered a less reliable predictor of protein function. This paper presents EnzyNet, a novel 3D-convolutional neural networks classifier that predicts the Enzyme Commission number of enzymes based only on their voxel-based spatial structure. The spatial distribution of biochemical properties was also examined as complementary information. The 2-layer architecture was investigated on a large dataset of 63,558 enzymes from the Protein Data Bank and achieved an accuracy of 78.4% by exploiting only the binary representation of the protein shape. Code and datasets are available at https://github.com/shervinea/enzynet.Comment: 11 pages, 6 figure

    High-Level Bottom-Up Cues for Top-Down Parsing of Facade Images

    Get PDF
    International audienceWe address the problem of parsing images of building facades. The goal is to segment images, assigning to the resulting regions semantic labels that correspond to the basic architectural elements. We assume a top-down parsing framework is developed beforehand, based on a 2D shape grammar that encodes a prior knowledge on the possible composition of facades. The algorithm explores the space of feasible solutions by generating the possible configurations of the facade and comparing it to the input data by means of a local, pixel- or patch-based classifier. We propose new bottom-up cues for the algorithm, both for evaluation of a candidate parse and for guiding the exploration of the space of feasible solutions. The method that we propose benefits from detection-based information and leverages on the similar appearance of elements that repeat in a given facade. Experiments performed on standard datasets show that this use of more discriminative bottom-up cues improves the convergence in comparison to state-of-the-art algorithms, and gives better results in terms of precision and recall, as well as computation time and deviation
    • …
    corecore